[mlir] Return vectorized values instead of replacing #144158
Conversation
@llvm/pr-subscribers-mlir @llvm/pr-subscribers-mlir-linalg
Author: None (Max191)
Changes: Exposes a new
Full diff: https://github.com/llvm/llvm-project/pull/144158.diff
2 Files Affected:
diff --git a/mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h b/mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h
index 2eef0a06d0eb4..de09dae24eccf 100644
--- a/mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h
+++ b/mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h
@@ -772,17 +772,26 @@ LogicalResult deallocateGPUPrivateMemory(OpBuilder &, Value /*buffer*/);
bool hasVectorizationImpl(Operation *);
/// Emit a suitable vector form for an operation. If provided,
-/// `inputVectorSizes` are used to vectorize this operation. `inputVectorSizes`
-/// must match the rank of the iteration space of the operation and the sizes
-/// must be smaller or equal than their counterpart interation space sizes, if
-/// static. `inputVectorShapes` also allows the vectorization of operations with
-/// dynamic shapes.
+/// `inputVectorSizes` are used to vectorize this operation.
+/// `inputVectorSizes` must match the rank of the iteration space of the
+/// operation and the input vector sizes must be greater than or equal to
+/// their counterpart iteration space sizes, if static. `inputVectorShapes`
+/// also allows the vectorization of operations with dynamic shapes.
LogicalResult vectorize(RewriterBase &rewriter, Operation *op,
ArrayRef<int64_t> inputVectorSizes = {},
ArrayRef<bool> inputScalableVecDims = {},
bool vectorizeNDExtract = false,
bool flatten1DDepthwiseConv = false);
+/// Vectorize and store new vectorized results in `newResults`, without replacing
+/// the old `op`.
+LogicalResult vectorize(RewriterBase &rewriter, Operation *op,
+ SmallVector<Value> &newResults,
+ ArrayRef<int64_t> inputVectorSizes = {},
+ ArrayRef<bool> inputScalableVecDims = {},
+ bool vectorizeNDExtract = false,
+ bool flatten1DDepthwiseConv = false);
+
/// Emit a suitable vector form for a Copy op with fully static shape.
LogicalResult vectorizeCopy(RewriterBase &builder, memref::CopyOp copyOp);
diff --git a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
index ff28bd7c48342..3efef3af93fa3 100644
--- a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
+++ b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
@@ -2522,13 +2522,8 @@ bool mlir::linalg::hasVectorizationImpl(Operation *op) {
tensor::InsertSliceOp>(op);
}
-/// Emit a suitable vector form for an operation. If provided,
-/// `inputVectorSizes` are used to vectorize this operation.
-/// `inputVectorSizes` must match the rank of the iteration space of the
-/// operation and the input vector sizes must be greater than or equal to
-/// their counterpart iteration space sizes, if static. `inputVectorShapes`
-/// also allows the vectorization of operations with dynamic shapes.
LogicalResult mlir::linalg::vectorize(RewriterBase &rewriter, Operation *op,
+ SmallVector<Value> &newResults,
ArrayRef<int64_t> inputVectorSizes,
ArrayRef<bool> inputScalableVecDims,
bool vectorizeNDExtract,
@@ -2558,57 +2553,65 @@ LogicalResult mlir::linalg::vectorize(RewriterBase &rewriter, Operation *op,
}
}
- SmallVector<Value> results;
- auto vectorizeResult =
- TypeSwitch<Operation *, LogicalResult>(op)
- .Case<linalg::LinalgOp>([&](auto linalgOp) {
- // TODO: isaConvolutionOpInterface that can also infer from
- // generic features. Will require stride/dilation attributes
- // inference.
- if (isa<ConvolutionOpInterface>(linalgOp.getOperation())) {
- FailureOr<Operation *> convOr = vectorizeConvolution(
- rewriter, linalgOp, inputVectorSizes, inputScalableVecDims,
- flatten1DDepthwiseConv);
- if (succeeded(convOr)) {
- llvm::append_range(results, (*convOr)->getResults());
- return success();
- }
-
- LDBG("Unsupported convolution can't be vectorized.\n");
- return failure();
- }
-
- LDBG("Vectorize generic by broadcasting to the canonical vector "
- "shape\n");
-
- // Pre-process before proceeding.
- convertAffineApply(rewriter, linalgOp);
-
- // TODO: 'vectorize' takes in a 'RewriterBase' which is up-casted
- // to 'OpBuilder' when it is passed over to some methods like
- // 'vectorizeAsLinalgGeneric'. This is highly problematic: if we
- // erase an op within these methods, the actual rewriter won't be
- // notified and we will end up with read-after-free issues!
- return vectorizeAsLinalgGeneric(rewriter, state, linalgOp, results);
- })
- .Case<tensor::PadOp>([&](auto padOp) {
- return vectorizeAsTensorPadOp(rewriter, padOp, inputVectorSizes,
- results);
- })
- .Case<linalg::PackOp>([&](auto packOp) {
- return vectorizeAsTensorPackOp(rewriter, packOp, inputVectorSizes,
- results);
- })
- .Case<linalg::UnPackOp>([&](auto unpackOp) {
- return vectorizeAsTensorUnpackOp(rewriter, unpackOp,
- inputVectorSizes, results);
- })
- .Case<tensor::InsertSliceOp>([&](auto sliceOp) {
- return vectorizeAsInsertSliceOp(rewriter, sliceOp, inputVectorSizes,
- results);
- })
- .Default([](auto) { return failure(); });
+ return TypeSwitch<Operation *, LogicalResult>(op)
+ .Case<linalg::LinalgOp>([&](auto linalgOp) {
+ // TODO: isaConvolutionOpInterface that can also infer from
+ // generic features. Will require stride/dilation attributes
+ // inference.
+ if (isa<ConvolutionOpInterface>(linalgOp.getOperation())) {
+ FailureOr<Operation *> convOr = vectorizeConvolution(
+ rewriter, linalgOp, inputVectorSizes, inputScalableVecDims,
+ flatten1DDepthwiseConv);
+ if (succeeded(convOr)) {
+ llvm::append_range(newResults, (*convOr)->getResults());
+ return success();
+ }
+
+ LDBG("Unsupported convolution can't be vectorized.\n");
+ return failure();
+ }
+
+ LDBG("Vectorize generic by broadcasting to the canonical vector "
+ "shape\n");
+
+ // Pre-process before proceeding.
+ convertAffineApply(rewriter, linalgOp);
+
+ // TODO: 'vectorize' takes in a 'RewriterBase' which is up-casted
+ // to 'OpBuilder' when it is passed over to some methods like
+ // 'vectorizeAsLinalgGeneric'. This is highly problematic: if we
+ // erase an op within these methods, the actual rewriter won't be
+ // notified and we will end up with read-after-free issues!
+ return vectorizeAsLinalgGeneric(rewriter, state, linalgOp, newResults);
+ })
+ .Case<tensor::PadOp>([&](auto padOp) {
+ return vectorizeAsTensorPadOp(rewriter, padOp, inputVectorSizes,
+ newResults);
+ })
+ .Case<linalg::PackOp>([&](auto packOp) {
+ return vectorizeAsTensorPackOp(rewriter, packOp, inputVectorSizes,
+ newResults);
+ })
+ .Case<linalg::UnPackOp>([&](auto unpackOp) {
+ return vectorizeAsTensorUnpackOp(rewriter, unpackOp, inputVectorSizes,
+ newResults);
+ })
+ .Case<tensor::InsertSliceOp>([&](auto sliceOp) {
+ return vectorizeAsInsertSliceOp(rewriter, sliceOp, inputVectorSizes,
+ newResults);
+ })
+ .Default([](auto) { return failure(); });
+}
+LogicalResult mlir::linalg::vectorize(RewriterBase &rewriter, Operation *op,
+ ArrayRef<int64_t> inputVectorSizes,
+ ArrayRef<bool> inputScalableVecDims,
+ bool vectorizeNDExtract,
+ bool flatten1DDepthwiseConv) {
+ SmallVector<Value> results;
+ LogicalResult vectorizeResult = mlir::linalg::vectorize(
+ rewriter, op, results, inputVectorSizes, inputScalableVecDims,
+ vectorizeNDExtract, flatten1DDepthwiseConv);
if (failed(vectorizeResult)) {
LDBG("Vectorization failed\n");
return failure();
Thanks for sending this! What is the rationale and the end-goal here?
Sure, but without tests upstream, this change feels very arbitrary. What is meant to decide whether or not to use the vectorization output? Could this be a job for a cost model?
I have a use case downstream in IREE, although I haven't sent the code out for it yet. I want to reuse the linalg.generic vectorization implementation for a custom op in IREE, since the vectorization for them is similar. My transformation creates a new generic op with a body similar to the custom op's, and then vectorizes the generic. The results of the generic cannot immediately replace anything, since this is just a step in the vectorization transformation, so I want a way to vectorize and simply get the results of the vectorized operation.
Ideally, I wouldn't need to create a new linalg.generic like this, but the vectorization implementations are not very friendly for doing anything different. Perhaps in the future the vectorization implementations could be refactored, but I don't have the bandwidth to take on that task right now.
Thanks for elaborating - that helps clarify the intent. That said, I don’t see a clear path to testing this upstream. Without tests, this becomes “dead code” as soon as it lands. And dead code - even if small - is a maintenance burden. In general, untested code should be removed (or not merged in the first place). One might argue it’ll be used in IREE, but MLIR maintenance shouldn't depend on context from downstream projects. I also get the impression there may be an alternative way to approach this - I’ve included some more specific concerns and questions below.
True, the Linalg vectorizer’s structure is a bit convoluted - but at its core, it behaves like a pattern (see VectorizationPattern). So "replacement" semantics still seem appropriate here. Also, since this is a Linalg vectorizer, its purpose is to take a
Got it - but this reads as though you're trying to use the Linalg vectorizer as part of a custom lowering pipeline, with something like:
If that's the case, why can’t step (1) happen in a separate transformation (upstream or downstream)? That would allow the existing vectorizer to be used as intended, replacing
OK, to me this implies that you won't be able to avoid some downstream changes for this to work? Could this change also live downstream?
Totally understandable. But even so, I think we need to at least clarify what the longer-term trajectory would be, so that any short-term workaround doesn't create architectural friction. Just to reiterate: I'd like to understand the use case better and help you find a path forward. I do appreciate that this is a relatively small change, but it seems to cut against MLIR's general principles. That said, maybe there are existing examples or infra for invoking patterns without replacement? If so, happy to explore those. We can also ask on Discourse. -Andrzej (*) It also supports
The new callback is directly called in the existing linalg::vectorize transform now, so this new function is tested by all of the vectorization tests in MLIR already. Was there some other type of test that you had in mind?
I suppose it could, although in my use case, it is safer to do this as a single transformation because it ensures that no new tensor ops remain in the IR. One of the requirements of the transformation is that the custom op is bufferized, so I don't want the tensor ops that I inject to remain in the IR after the transformation. If I do the conversion to linalg.generic as a separate transformation, then I will have to rely on separate folders for vector.transfer_read/transfer_write pairs to ensure all tensor ops are folded away after calling linalg::vectorize.
I believe this is not uncommon here in MLIR. The TilingInterface utilities, for example, don't do any replacement; see mlir/include/mlir/Dialect/SCF/Transforms/TileUsingInterface.h, lines 152 to 156 (at 414710c).
The callback returns an
I think the reasoning for having transformations like this is that it allows the user of the transform to apply additional transformations to the results of the transform before doing the replacement. That is generally better than transforming after replacing, IMO, since you don't want to replace ops until the transformation is fully complete.
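The "return replacements instead of replacing" style discussed here can be sketched with simplified stand-in types. Everything in this snippet (Op, Replacement, TransformResult, applyTransform) is a hypothetical placeholder; the real utilities use RewriterBase, Operation*, mlir::Value, and result structs like SCFTilingResult.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for IR objects; not real MLIR types.
struct Op { std::string name; };
struct Replacement { std::string name; };

// Result struct in the TilingInterface style: the transform hands back the
// replacement values and lets the caller decide when (or whether) to replace.
struct TransformResult {
  std::vector<Replacement> replacements;
};

// The transform produces new values but performs no replacement itself;
// failure is signaled through the optional, mirroring FailureOr<T>.
std::optional<TransformResult> applyTransform(const Op &op) {
  if (op.name.empty())
    return std::nullopt; // op not supported by this transform
  return TransformResult{{Replacement{op.name + ".vectorized"}}};
}
```

The caller can inspect or further transform `replacements` before committing them to the IR, which is the control this PR is after.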
TLDR: @Max191 I think it is better to return std::optional<VectorizationResult> like other transform methods, but let's see how the discussion goes before you make other changes.
I just came back from vacation and I'm still catching up on the threads. If I read it correctly, there are a few concerns:
- The expectation of the return values from the vectorize() method.
- Test coverage for the new code.
- Test coverage between upstream and downstream projects.
- The longer-term trajectory for vectorization.
(1), (2), and (3) are all important, as others could follow bad PRs in the future, and we should keep the review bar as consistent as we can. I need to chat with @Max191 about downstream changes, but I can share my thoughts, which are orthogonal to the downstream implementation details.
RE (1): I've been thinking about this since I saw other PRs about refactoring "patterns" and "transforms". TLDR is that I'm +1 on returning transformed IR from methods. The line between patterns and functions, to me, is whether it returns the final results or not. People can use functions to build patterns: either just use a single function, or assemble the pieces from multiple functions. This happens not only in the linalg dialect, but also in other dialects. As we expose the vectorize method today, I'd expect it to return std::optional<VectorizeResult>. We have such a structure in other linalg transforms, e.g., Tiling, Promotion, etc., but not vectorize(). I think we had the same intention for the vectorization, but it got lost at some point. Thus, I took a stab at digging into the history.
When we lived in the very old world, we developed most transformations using the pattern-rewriting framework with linalg transform filters (see the RFC for the retirement plan), including tiling, promotion, vectorization, etc. We've spent a lot of time moving core functionality into C++ functions and reusing it everywhere, including transform ops. Tiling is a good example to follow. Here is the oldest commit that shows the change. It returned void in the first place, as that was the first step.
At one point, we had the std::optional<*Result> style, like the other result structs (TilingResult, TileAndFuseResult, etc.):
/// Emit a suitable vector form for a Linalg op with fully static shape.
struct VectorizedLinalgOp {
SmallVector<Value> tensorResults;
VectorizedLinalgOp &operator=(const VectorizedLinalgOp &) = default;
};
Optional<VectorizedLinalgOp> vectorizeLinalgOp(OpBuilder &builder,
Operation *op);
Later, a contributor refactored the handling of the result into a by-reference function argument for some reason (c1a4cd5), and it became:
LogicalResult vectorizeLinalgOp(OpBuilder &builder, Operation *op,
SmallVectorImpl<Value> &newResults);
We still tracked the return values until another refactoring. As you can see in that change, all other transformations return the transformed op except the vectorize() method. I don't find any comments from the review; it may be because we wanted to implement a better vectorizer, so we were being conservative. Or maybe it is just because it looked straightforward in the change itself, e.g., it's easier because you don't need to introduce a new struct, and it'd be very similar to the deprecated one. I don't know.
However, if we get back to the question of the line between patterns and functions, I think we should return transformed IR from the functions; it is the pattern's responsibility to handle the return value and do the further transformation within its scope.
Note: other transforms, like generalizeNamedOp, perform the replacement and return the new ops. I think the vectorize method can follow the same API.
RE (2): If I read the code correctly, the current implementation does not introduce dead code, because everything goes through the old vectorize method, which calls the new vectorize method. It is not obvious, IMO. What I'd argue is: should we return std::optional<SmallVector<Value>> or std::optional<VectorizationResult> instead? It'd break downstream users, but I think it is a reasonable move. What do you think?
RE (3): I think all the code is well-tested upstream, and there is no dead code atm. As I mentioned above, I'd suggest returning std::optional<VectorizationResult>, which makes it clearer. One of the things I like is that upstream has good test coverage for vectorization, so I only need a few tests in the IREE project, e.g., to make sure that vector size inference is working when masking is enabled in IREE. The vector size inference could be IREE-specific, so it's not upstreamed. The way I see it, the core functionality is tested upstream, and IREE just has slightly higher-level tests within the pass scope.
RE (4): I'm happy to see the discussion and happy to provide feedback. There was a discussion a long time ago, but we have not made progress so far. I hope that the above explanation addresses your concern and that this does not become a blocker.
I've been wanting to investigate how to move the vectorization to the next chapter, as we may have more future changes to linalg ops, and people would like to extend the existing vectorizer. My rough idea is that we need an interface; any operation that implements the interface can be vectorized by our new vectorizer system through the interface. Roughly, we'd need three interface methods. One reads data from tensor/memref into vectors and returns the loaded vectors. Another writes vectors into the destination, taking a sequence of vectors. The third is similar to the vectorize() method: it takes the input vectors, applies some computation, and returns the result vectors. E.g., a linalg op would use its body to perform the computation.
However, what's been confusing me is whether we should require vectorizable ops to implement the DPS interface, as we may need to pass input vector sizes for writing vectors. Today, they are well-defined for linalg ops because the inits are passed and we capture the iteration domain sizes in the indexing maps. I'm happy to chat with you offline, or we can prepare an RFC together, and hopefully (4) won't become a blocker for the change.
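The three-method interface idea floated above could look roughly like the following toy sketch. Every name here (VectorizableOpInterface, the Vector alias, AddOp, vectorizeViaInterface) is a hypothetical illustration of the proposal, not an existing MLIR API:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a vector value; a real design would use mlir::Value.
using Vector = std::vector<int>;

// Sketch of the proposed interface: read operands, compute, write results.
struct VectorizableOpInterface {
  virtual ~VectorizableOpInterface() = default;
  // Read data from tensor/memref operands into vectors.
  virtual std::vector<Vector> readOperands() = 0;
  // Apply the op's computation on the input vectors, returning the results.
  virtual std::vector<Vector> compute(const std::vector<Vector> &inputs) = 0;
  // Write result vectors into the destination(s).
  virtual void writeResults(const std::vector<Vector> &results) = 0;
};

// Toy elementwise-add op implementing the interface.
struct AddOp : VectorizableOpInterface {
  Vector lhs, rhs, dest;
  AddOp(Vector l, Vector r) : lhs(std::move(l)), rhs(std::move(r)) {}
  std::vector<Vector> readOperands() override { return {lhs, rhs}; }
  std::vector<Vector> compute(const std::vector<Vector> &in) override {
    Vector out(in[0].size());
    for (std::size_t i = 0; i < out.size(); ++i)
      out[i] = in[0][i] + in[1][i];
    return {out};
  }
  void writeResults(const std::vector<Vector> &results) override {
    dest = results[0];
  }
};

// Generic driver: any op implementing the interface gets vectorized the
// same way, which is the reuse the interface is meant to enable.
void vectorizeViaInterface(VectorizableOpInterface &op) {
  op.writeResults(op.compute(op.readOperands()));
}
```

The driver stays op-agnostic; the open question from the comment above (whether writeResults forces a DPS-style contract) shows up here as AddOp having to own its own destination.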
Thank you both for the additional context; that's exactly what I was missing. Special thanks to @hanhanW for the detailed and generous overview of related discussions; I found it extremely helpful 🙏🏻
To quickly clarify my earlier point about "dead code":

LogicalResult mlir::linalg::vectorize(...) {
  SmallVector<Value> results;
  LogicalResult vectorizeResult = mlir::linalg::vectorize(results, ...);
}

back into something like:

LogicalResult mlir::linalg::vectorize(...) {
  SmallVector<Value> results;
  // inline mlir::linalg::vectorize(results, ...);
}

In other words, the new overload of
Thanks for pointing this out; my bad for not paying closer attention to similar patterns elsewhere. It does seem like this change is trying to bring the vectorize API more in line with the structure used in other transformations, e.g.:
That said, as @hanhanW mentioned, that seems inconsistent, and I'd love to understand the rationale. Clarifying that would help confirm that this change is indeed a step toward a more uniform transformation interface, which I agree is a worthwhile goal.
Wouldn’t
That sounds great; I'd be happy to sync offline as well. All in all, I'm happy for you to move forward with this change. That said, my one request would be to ensure that only a single vectorize entry point is exposed. Based on the discussion, the following signature seems sufficient to serve both upstream and downstream needs:

FailureOr<VectorizationResult>
vectorize(RewriterBase &rewriter,
          Operation *op,
          ArrayRef<int64_t> inputVectorSizes = {},
          ArrayRef<bool> inputScalableVecDims = {},
          bool vectorizeNDExtract = false,
          bool flatten1DDepthwiseConv = false);

This would also:
How does this sound?
That SGTM! I was thinking the same thing about it. Big thank you to Hanhan for adding great additional context!
Nice discussion! I haven't had the time to dive deep into the details, but I'm generally +1 on making the vectorizer more modular and reusable. We talked about introducing interfaces to decouple the information of the iteration space to be vectorized from the actual ops, which is something we could explore. My only concern would be if we head towards implementing an ad-hoc "conversion rewriter" where IR modifications have to live in data structures on the side, or we set the expectation that the vectorizer can't materialize changes directly in the IR. That would be difficult to justify and maintain. Happy to provide feedback on a specific example, if you have one you can share.
Honestly, I don't know which is better or what the rationale is. Maybe different contributors have different preferences. If I had to choose, I'd not perform the replacement, because it provides more control to users. I'm not sure whether the replacement would become an issue when people build complicated patterns and dialect conversion or a type converter is involved.
Yes, it can be:

if (failed(vectorize(...)))
  return failure();

But yeah,
Thanks, it sounds good to me!
Signed-off-by: Max Dawkins <[email protected]>
I have updated the implementation to return a FailureOr<VectorizationResult>.
  rewriter.eraseOp(op);
-  return success();
+  return VectorizationResult({results});
Since you're using braced initialization lists, I think this can be VectorizationResult{results} or just {results}.
/// also allows the vectorization of operations with dynamic shapes. Returns
/// a VectorizationResult containing the results of the vectorized op, or
/// failure if the transformation fails.
FailureOr<VectorizationResult>
I'd move "Returns ..." to the first sentence, which is a high-level overview of the function. And maybe remove "Emit a suitable vector form for an operation."?
Thanks for the updates; this is roughly what I expected 🙂
I've posted a few general discussion points, but they're non-blocking. The main issue I'd like to flag is naming (VectorizationHookResult vs. VectorizationResult), but naming is hard 🙂
By the way, @Max191, I noticed you force-pushed and rewrote the history. Please try to avoid that, and keep in mind:
From experience, many of us do rebase and force-push, but without squashing (i.e., we preserve individual commits). I'm personally fine with that, as long as the original commits are retained - otherwise it becomes harder to follow the review context, especially for post-commit review.
Not a big deal, just a kind request for future PRs 🙂
And as for rebasing strategies in general - that's a topic better suited for Discourse.
/// Transformation information returned after vectorizing.
struct VectorizationResult {
  /// Results of the vectorization transform to replace the original operation.
  SmallVector<Value> replacements;
};
- Not quite "transformation information" ;-)
- Do we need a vector of results? I don't want to sound "limiting" or call for a bigger refactor in this PR, but from what I recall, the vectorizer always gets an Op with one result and returns a vectorized result. Is this true? I just suspect that a single Value would be sufficient. This could be something for a follow-up PR.
- VectorizationResult -> VectorizedResult? This name is a bit too close to VectorizationHookResult and the distinction is not well documented. My suggestion is not that much better, but it "hints" that it's just the actual vectorized Op (as opposed to "vectorized Op + status flag"). I will try to come back with sth better :)
Updates the linalg::vectorize function to return a FailureOr<VectorizationResult> containing the values to replace the original operation, instead of directly replacing the original operation. This aligns better with the style of transforms used with the TilingInterface, and gives more control to users over the lowering, since it allows for additional transformation of the IR before replacement.
There was already a VectorizationResult defined, which was used for the internal vectorize implementation using CustomVectorizationHooks, so the old struct is renamed to VectorizationHookResult.
Note for integration: The replacement of the original operation is now the responsibility of the caller, so wherever linalg::vectorize is used, the caller must also do rewriter.replaceOp(op, vectorizeResults->replacements).
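The caller-side pattern from the integration note can be sketched with simplified stand-in types. Value, VectorizationResult, vectorize, and vectorizeAndReplace below are hypothetical placeholders that only model the new contract; real callers work with RewriterBase, Operation*, and rewriter.replaceOp:

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Stand-ins for illustration only; not the real MLIR classes.
struct Value { std::string name; };
struct VectorizationResult { std::vector<Value> replacements; };

// Models the new API: vectorization returns replacement values (or failure)
// instead of replacing the original op itself.
std::optional<VectorizationResult> vectorize(const Value &op) {
  if (op.name.empty())
    return std::nullopt; // vectorization failed
  return VectorizationResult{{Value{op.name + ".vec"}}};
}

// Caller-side pattern after this PR: check for failure, optionally transform
// the results further, then perform the replacement (the return here stands
// in for rewriter.replaceOp(op, results->replacements)).
std::vector<Value> vectorizeAndReplace(const Value &op) {
  auto results = vectorize(op);
  if (!results)
    return {}; // leave the original op untouched on failure
  return results->replacements;
}
```

The key migration point is that forgetting the final replacement step now leaves the original op in the IR, which is why every existing call site must be updated.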